Automatic Selfie Camera

ECE 5725 Final Project: Cole Gilbert (nsg68) and Audrey Sackey (aas332)

Introduction

Inspired by the concept of gesture-controlled technology, our project is an automatic selfie camera that points toward a handclap and takes a photo. The system makes use of the PiCamera, the piTFT, and two microphones for audio input and signal processing. Our design comes in two parts: the piTFT user display, which comprises the UI and all of its menu items, and the controls setup, which consists of two USB microphones, the PiCamera, and the servos that aim the camera.

Objective

The objective of this project is to streamline the process of taking selfies through the integration of gesture recognition with a user-friendly interface. By utilizing the capabilities of the Raspberry Pi ecosystem, we aim to create an intuitive and seamless user experience that eliminates the need for physical interaction with the camera. This project not only simplifies the selfie-taking process but also explores practical applications of sound localization and gesture control in consumer electronics. The end goal is to deliver a robust and reliable system that responds accurately to the user’s gestures, providing an innovative and fun way to capture moments with ease. Also, it’s just really cool to have a camera point directly at you and take your photo when you clap! Below is a diagram of our initial project:

Design and Testing

Capturing the audio input

For accurate signal processing, we connected two microphones to the Raspberry Pi, which capture audio from the surroundings for processing. The microphones record the input and save it to a .wav file. We use two microphones both for accuracy and to determine the direction of the sound. We then read the frames from this file and perform an amplitude analysis to distinguish a handclap from ambient noise. Through testing we found that handclap amplitudes range from about 10,000 to 30,000. Since we are using 16-bit PCM, the audio range for the mics is -32,768 to 32,767, so relative to the maximum PCM value our handclaps are near the peaks, which is expected. To set reasonable thresholds for handclap detection, we compared the peaks of handclaps to components of ambient noise, such as human speech. Specifically, we compared four handclaps to four spoken words: “Ladies and gentlemen, welcome.”

amplitude vs. time graph with no claps
amplitude vs. time graph with claps

As shown in the images, the handclap amplitudes are distinct, reaching into the tens of thousands, while human speech amplitudes fall between 0 and 5,000. This clear distinction between handclap amplitudes and ambient noise justified our method of filtering by amplitude. Our program reads the frames from the file every second so that we don't miss a handclap and to keep signal processing fast and streamlined. The output .wav file is also updated every second to incorporate new audio input.
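
To make the amplitude check concrete, here is a minimal sketch of how a one-second recording could be read and its peak measured. It is not our exact code: the filename is a placeholder, and it assumes 16-bit PCM audio and the Python wave and numpy libraries.

```python
# Minimal sketch of the amplitude check (placeholder filename, 16-bit PCM assumed).
import wave
import numpy as np

CLAP_THRESHOLD = 15000  # detection threshold discussed in the next section

def peak_amplitude(path):
    """Return the largest absolute sample value in a 16-bit PCM .wav file."""
    with wave.open(path, "rb") as wav:
        frames = wav.readframes(wav.getnframes())
    # Widen to int32 so abs(-32768) does not overflow int16.
    samples = np.frombuffer(frames, dtype=np.int16).astype(np.int32)
    return int(np.max(np.abs(samples)))

if peak_amplitude("mic_left.wav") > CLAP_THRESHOLD:
    print("Handclap detected")
```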

Using the input to control the motors

Through repeated trials of handclaps, we established a detection threshold of 15,000. Although the image above shows that some handclap peaks may fall slightly below this threshold, those were typically associated with very weak handclaps; we chose to optimize our system for strong handclaps in order to filter out environmental noise that might include weaker claps. To determine the direction of a handclap, we used two microphones. While both microphones receive the same input, their amplitude readings vary based on their proximity to the handclap source, and our program analyzes these amplitude differences to infer the direction of the sound. Through extensive testing we found that if the amplitude difference between the microphones is minimal, that is, between 400 and 7,000, the handclap source is nearly equidistant from both microphones, and the servos keep the camera centered. When the left microphone records a significantly higher amplitude, outside this range, the program directs the servo to move the camera to the left; conversely, if the right microphone records a higher amplitude, the servo moves the camera to the right. This method ensures accurate and responsive camera positioning.
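
The direction logic itself reduces to a small comparison. The sketch below reuses the peak_amplitude() helper from the previous sketch with placeholder filenames; the 15,000 threshold and the roughly 400-7,000 "centered" window come from our testing, and treating even smaller differences as centered is our simplifying assumption.

```python
# Sketch of the direction decision; peak_amplitude() comes from the sketch above.
CLAP_THRESHOLD = 15000

def infer_direction(left_peak, right_peak):
    """Return 'center', 'left', or 'right' from the two microphone peaks."""
    diff = abs(left_peak - right_peak)
    if diff <= 7000:   # observed "centered" differences fell roughly in 400-7,000
        return "center"
    return "left" if left_peak > right_peak else "right"

left = peak_amplitude("mic_left.wav")      # placeholder filenames
right = peak_amplitude("mic_right.wav")
if max(left, right) > CLAP_THRESHOLD:
    direction = infer_direction(left, right)   # passed on to the servo code
```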

Taking a picture

Our program is designed to take a picture after the servos have performed specific movements. For instance, in a left-turn movement, the horizontal servo first rotates the camera to the left, and then the vertical servo housing the PiCamera tilts downward to position the camera toward the subject. Once the camera is correctly positioned, the PiCamera is accessed to capture the photo. After the picture is taken, the vertical servo tilts back up, and both servos return to the center position. The PiCamera interfaces with the rest of the system through the Raspberry Pi, and the control logic for the servos and the PiCamera is managed by a Python script running on the Pi. This script uses the ‘RPi.GPIO’ library for servo control and ‘picamera’ for interfacing with the PiCamera. One key thing we ensure is that there is a reasonable delay during the entire motion and picture-taking process to avoid blurred photos: the delay allows the servos to stabilize and the PiCamera to focus properly, ensuring clear and sharp images. This integrated approach keeps the system operating smoothly and efficiently, providing a seamless user experience.
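
A simplified sketch of the left-turn sequence is shown below. The GPIO pins, angles, and delays are illustrative placeholders rather than our exact values, and the duty-cycle mapping assumes standard 50 Hz hobby servos.

```python
# Hedged sketch of a left-turn-and-capture sequence (placeholder pins/angles/delays).
import time
import RPi.GPIO as GPIO
from picamera import PiCamera

PAN_PIN, TILT_PIN = 13, 19            # hypothetical BCM pin numbers

GPIO.setmode(GPIO.BCM)
GPIO.setup(PAN_PIN, GPIO.OUT)
GPIO.setup(TILT_PIN, GPIO.OUT)
pan = GPIO.PWM(PAN_PIN, 50)           # hobby servos expect a 50 Hz signal
tilt = GPIO.PWM(TILT_PIN, 50)
pan.start(7.5)                        # ~90 degrees, i.e. centered
tilt.start(7.5)

def set_angle(pwm, angle):
    """Map 0-180 degrees to a 2.5-12.5% duty cycle and let the servo settle."""
    pwm.ChangeDutyCycle(2.5 + angle / 18.0)
    time.sleep(0.5)                   # settling delay helps avoid blurred photos

camera = PiCamera()
set_angle(pan, 135)                   # rotate left
set_angle(tilt, 60)                   # tilt down toward the subject
time.sleep(1.0)                       # extra stabilization before capturing
camera.capture("selfie0.jpg")
set_angle(tilt, 90)                   # tilt back up
set_angle(pan, 90)                    # return to center
```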

Sending an email

One key element of our selfie camera was ensuring that users could access the photos they took on the device. We implemented this with a program that emails the user an image that they select from the photos they have taken. To get this working, we first made an email address for the project, “ece5725selfiecam@gmail.com”, and then used an SMTP (Simple Mail Transfer Protocol) server to send emails from Python code, making sure to attach the selected image.
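
A minimal sketch of the email step is shown below, using Python's built-in smtplib and email libraries with Gmail's SMTP server; the recipient, image path, and app password are placeholders (obtaining the app password is described under Issues encountered).

```python
# Minimal sketch of emailing a selected selfie over Gmail SMTP (placeholder credentials).
import smtplib
from email.message import EmailMessage

SENDER = "ece5725selfiecam@gmail.com"
APP_PASSWORD = "xxxx xxxx xxxx xxxx"   # Gmail app password, not the account password

def send_selfie(recipient, image_path):
    """Email the chosen photo as a JPEG attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Your selfie from the Automatic Selfie Camera"
    msg["From"] = SENDER
    msg["To"] = recipient
    msg.set_content("Here is the photo you selected!")
    with open(image_path, "rb") as f:
        msg.add_attachment(f.read(), maintype="image", subtype="jpeg",
                           filename=image_path)
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(SENDER, APP_PASSWORD)
        server.send_message(msg)
```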

User interface

The glue that connects all of these components together is the user interface, which is a combination of the piTFT display, the piTFT touchscreen functionality, and the various possible states of the automatic selfie camera. We used PyGame, Pigame, and PiTFTTouchscreen extensively to allow for touch interactions with the screen. Every touch is registered and, based on its location and the current state, triggers a different action. To test all of the state transitions, we made a state transition diagram and used plenty of print statements to see where each button press would take the user.
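
The sketch below illustrates the shape of this touch-driven state machine. It is a simplified version: plain PyGame mouse events stand in for the pigame/piTFT touch layer, and the button rectangles and transitions are placeholders for the full set described in the list that follows.

```python
# Simplified sketch of the touch-driven state machine (placeholder buttons/transitions).
import pygame

pygame.init()
screen = pygame.display.set_mode((320, 240))        # piTFT resolution

state = "home"
buttons = {
    "home": {"start": pygame.Rect(40, 90, 100, 60),
             "quit": pygame.Rect(180, 90, 100, 60)},
    "instructions": {"library": pygame.Rect(10, 180, 70, 50),
                     "email": pygame.Rect(90, 180, 70, 50),
                     "camera": pygame.Rect(170, 180, 70, 50),
                     "back": pygame.Rect(250, 180, 60, 50)},
}

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.MOUSEBUTTONDOWN:
            pos = pygame.mouse.get_pos()
            for name, rect in buttons.get(state, {}).items():
                if rect.collidepoint(pos):
                    # The same touch location means different things in different states.
                    if state == "home" and name == "start":
                        state = "instructions"
                    elif state == "home" and name == "quit":
                        running = False
                    elif state == "instructions" and name == "back":
                        state = "home"
                    # ... library, email, and camera states handled similarly
    pygame.display.flip()
pygame.quit()
```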

  • The first state is the home screen, which features the start and quit buttons; these allow the user to progress to the instructions screen or to quit the program. This state serves as a home base of sorts for the automatic selfie camera.
  • The second state is the instructions screen which includes instructions for how to operate the selfie camera. It also has four buttons: library, email, camera, and back. The back button just sends the user to the previous state.
  • The third state is the library state, reachable by pressing the library button on the instructions screen. This state presents the complete photo library, letting users page through all the photos taken during the current run of the automatic selfie camera. Users press the ‘prev’ button to see the previous image, the ‘next’ button to see the next image, and the select button to select their favorite photo and return to the instructions screen. Note that the photo library is cyclic: if you reach the last photo and press ‘next’, it takes you back to the first photo. We are able to navigate through all of the photos taken so far via a simple naming scheme: each photo is saved as “selfieX.jpg”, where X is a number that starts at 0 and increases every time you take a photo, so after taking 5 photos the library contains selfie0.jpg, selfie1.jpg, selfie2.jpg, selfie3.jpg, and selfie4.jpg. The ‘prev’ and ‘next’ buttons simply increment or decrement a counter (wrapping it around when it goes below 0 or reaches the number of images in the library), and this counter is then concatenated onto a string to flash the corresponding image onto the screen (a short sketch of this indexing appears after this list). The select button saves the selected image’s index and returns the user to the instructions page, similar to the back button.
  • The fourth state is the email state, reachable by pressing the email button on the instructions screen. This state includes a back button (which sends the user to the previous state) and a text box where you can enter your email address using the attached keyboard. Once you have entered a valid email address, pressing enter sends a message with the selected photo to that address and returns the user to the home screen.
  • The fifth and final state is the camera state, reachable by pressing the camera button on the instructions screen. The camera state only has a back button, but it activates the sound-locating camera: in this state, any time you clap, the camera angles toward the clap, takes a photo, and saves it to the library.
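
As referenced in the library-state description above, the navigation logic is just a wrapping counter turned into a filename; a minimal sketch (with a placeholder image count) follows.

```python
# Minimal sketch of the cyclic library navigation (placeholder image count).
num_images = 5                     # number of selfieX.jpg files saved so far

def next_index(index):
    return (index + 1) % num_images    # wraps from the last photo back to selfie0.jpg

def prev_index(index):
    return (index - 1) % num_images    # wraps from selfie0.jpg to the last photo

index = 0
filename = "selfie" + str(index) + ".jpg"   # image flashed onto the screen
```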

To visualize these transitions, please see the demo portion of our project’s video and the state transition diagram below:

Issues encountered

Although we successfully completed the project, we encountered a notable challenge concerning microphone selection. Initially, we employed a pair of non-USB microphones coupled with a bandpass filter circuit; this setup interfaced with an analog-to-digital converter before connecting to the Raspberry Pi. We had difficulty obtaining accurate inputs and observed minimal change in the audio signal when a handclap occurred. Given that our project's essence relied on discerning handclaps amidst ambient noise, it became apparent that the mics needed to be changed. After experimenting with alternative microphones, we suspected that the bandpass filter itself was the root cause, so we discarded the filter and decided to do our frequency filtering in software. This still did not fix the issue. Ultimately, we transitioned to USB microphones, a decision that significantly streamlined our setup and provided clear, easily interpretable data. Below is a diagram of our original bandpass filter:

original bandpass filter schematic

We also encountered a major issue with sending emails from a custom email account. We had trouble getting around two-factor authentication for the email accounts we created, which is standard for all accounts (we tried Gmail, AOL, and Yahoo). Through extensive research we found that Gmail supports third-party app sign-in, which allowed us to pass a specific key that verifies we are using a trusted service without having to authenticate interactively.

Despite these setbacks, the remainder of our project progressed smoothly. This experience emphasized the importance of meticulous hardware selection and iterative problem-solving during the development process.

Results

Here is a brief overview of how all the different parts of our finalized automatic selfie camera's system interact with each other:

  • User Interface: The touchscreen UI selects the current state, which determines which actions are available at any given moment.
  • Microphone Input: The microphones capture audio signals, which are processed to detect a handclap and determine its direction.
  • Servo Movement: Based on the direction inferred from the audio signal, the servos adjust their positions to orient the camera towards the source of the sound.
  • PiCamera Activation: Once the servos have moved to the desired position, the Python script commands the PiCamera to take a picture.
  • Servo Reset: After the picture is taken, the servos return to their default positions, ready for the next handclap.
  • Email Integration: Once the user selects an image and enters their email address, an email is sent from a custom email address: “ece5725selfiecam@gmail.com”.

We used three main programs to control our entire project: start_screen.py (Figure 1), mic_loop.sh (Figure 2), and start_proj.sh (Figure 3). Together, these scripts control all of the actions and tasks listed previously.

start_screen.py

  • Controls user interface visual display
  • Controls user interface state changes
  • Controls the camera assembly (servo motors and the picamera)
  • Takes photos and saves them in a directory
  • Sends emails with selected photos
  • Processes audio from .wav files

mic_loop.sh

  • Polls microphones for audio every second and updates .wav files

start_proj.sh

  • Clears photo library
  • Runs mic_loop.sh in the background
  • Runs start_screen.py in the foreground

The project's objective was to deliver a robust and reliable selfie camera system that accurately responds to user gestures, providing an innovative and fun way to capture moments with ease. We are proud to have achieved this goal despite encountering various challenges along the way. Collaborating to solve these issues encouraged teamwork, helped us develop problem-solving strategies, and improved our Python debugging and Linux troubleshooting skills. Overall, we consider this project a very successful endeavor, both for the outcome of the automatic selfie camera and for ourselves as engineers.

Conclusions and Acknowledgements

Overall, we are satisfied with the automatic selfie camera and glad we were able to bring our idea to life. Our system successfully detects handclaps and takes a picture in the direction of the handclap. The entire process of working on this project was eye-opening and insightful. One noteworthy thing we learned was that introducing analog signals for processing over the GPIO pins, even with the help of an ADC, isn't always effective. Despite our setbacks, we consider this project a very successful endeavor both for the outcome of the automatic selfie camera and for ourselves as engineers.

We extend our heartfelt gratitude to Professor Joe Skovira for his guidance and support throughout the project. His expertise and encouragement played a pivotal role in shaping our ideas and overcoming obstacles. We also want to acknowledge the invaluable assistance provided by the teaching assistants, whose feedback and support were instrumental in our project's success.

Future Work

If we had more time to work on the project, we would have pursued a more robust framework for directional movement. Our current system only detects handclaps in three directions (left, right, and center), and consequently the camera only moves in those three directions. Implementing finer-tuned directional movement over a wide range of angles would be a great extension of our project.
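
As a rough illustration of how this extension might look, the sketch below (not something we implemented) maps the normalized amplitude difference between the two microphones to a continuous pan angle instead of three fixed positions; the maximum angle and the mapping itself are placeholder assumptions that would need calibration.

```python
# Hypothetical extension: map the mic amplitude difference to a continuous pan angle.
import numpy as np

def pan_angle(left_peak, right_peak, max_angle=45.0):
    """Return a pan angle in degrees: negative pans left, positive pans right."""
    total = left_peak + right_peak
    if total == 0:
        return 0.0
    balance = (right_peak - left_peak) / total   # -1.0 (all left) .. +1.0 (all right)
    return float(np.clip(balance, -1.0, 1.0)) * max_angle
```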

Division of Labor

Audrey and Cole both contributed equally to this project; however, they worked on tasks that suited their strengths. Audrey wired the entire system together, worked with several different kinds of microphones, found the desired thresholds for the clap amplitude, and was a positive influence on the team in general. Cole worked on the user interface and the servo directionality calculations for locating the clap, and he developed the website.


Bill of Materials/Parts List

  • Raspberry Pi 4, 2GB RAM
  • Capacitive piTFT
  • SD card
  • Raspberry Pi power supply
  • 2 Grove microphones
  • 1 motor
  • 1 camera
  • 3 USB microphones
  • 2 breadboards
  • 1 Arduino Uno
  • 5 Adafruit microphones
  • 1 ADS1115 ADC

No parts were purchased. All of these were acquired from the lab, with the exception of the ADC, which was provided by a team member.

References

Source 1: Picamera Raspberry Pi blog
URL: https://www.tomshardware.com/how-to/use-picamera2-take-photos-with-raspberry-pi
About: Used camera code from this site

Source 2: Servo motor Raspberry Pi blog
URL: https://www.instructables.com/Controlling-Servo-Motor-Sg90-With-Raspberry-Pi-4/
About: Used servo motor code from this site

Source 3: Lab 2 files
URL: https://canvas.cornell.edu/courses/61463/files/10072024?module_item_id=2503913
About: Used the piTFT setup code from Lab 2 and screen_coordinates as a foundation for our user interface

Source 4: Mailtrap Gmail SMTP tutorial
URL: https://mailtrap.io/blog/gmail-smtp/
About: Followed instructions for SMTP setup via a Gmail account

Source 5: ECE 5725 student projects website
URL: https://skovira.ece.cornell.edu/ece5725-fall-2023-projects/
About: Referenced the PiNotes project for email functionality and the Guitar Hero project for FFT documentation

Source 6: ADC integration guide for Raspberry Pi
URL: https://how2electronics.com/how-to-use-ads1115-16-bit-adc-module-with-raspberry-pi/
About: Steps to use an ADS1115 ADC with the Raspberry Pi

Source 7: Microphone library
URL: https://wiki.seeedstudio.com/Grove-Sound_Sensor/
About: Grove sound sensor library/wiki

Source 8: Lab 2 handout
URL: https://canvas.cornell.edu/courses/61463/files/10072024?module_item_id=2503913
About: Troubleshooting the piTFT screen

Source 9: Frequency analysis GitHub
URL: https://github.com/Vaishd30/Frequency-Analysis-of-Audio-file-using-Raspberry-Pi/
About: Frequency analysis code for USB microphones